1. INTRODUCTION

Patient health records are stored in local databases or in the Cloud. On the one hand,
such data can be conveniently accessed over the internet from any place at any time by physicians,
insurance specialists, healthcare institutes, and researchers, particularly during pandemics such as COVID-19.
The COVID-19 pandemic has forced the whole globe into lockdown. During such a crisis,
healthcare institutes and researchers work to find a cure or vaccines by analyzing and exploring
the records of COVID-19 patients. The collection, analysis, processing, and up-to-date sharing
of these valuable records are essential for researchers collaborating to find vaccines or a cure for
this pandemic.

On the other hand, patient records, which include personal and medical data, are attractive
targets for intruders. Protecting such records therefore faces an ever-increasing number of risks.
For example, an attacker might use data mining methods to profile patients based
on their healthcare records. Additionally, records stored in local data storage or the Cloud can be
exploited by system administrators to bargain with or sell patients' data. With these threats, there are
growing concerns about the privacy and security of patients' records [1], [2].

Furthermore, governments and healthcare institutions use mobile applications to track COVID-19
patients as a means of controlling the spread of the pandemic. Such applications and their data are
highly sensitive. Laws must be enforced to protect the privacy of patients' data,
banking details, shopping records, and so forth. Governments, or a trusted third party as in [3], are
compelled to employ encryption models to enforce data security and privacy laws on COVID-19 tracking
applications and to guarantee the privacy and safety of society during this pandemic [4].

To protect patients' sensitive data and records while allowing timely sharing among
physicians and healthcare institutions, many encryption methods have been proposed in the literature to
secure the transmission and sharing of this data while preserving patient privacy.
Data encryption and decryption systems have been the traditional way to protect
data transmission and sharing from hackers for many years, and
encryption methods are currently the most widely used procedures for protecting sensitive data in transit.
Symmetric encryption methods, in which a single key is used for both encryption
and decryption, are used extensively [5]-[10]; examples include the Hill cipher, the data encryption standard (DES),
3DES, and the advanced encryption standard (AES). Other methods use at least two distinct yet
related keys, one for encryption and another for decryption. These methods are commonly called
public-key systems. Symmetric methods are relatively fast and difficult to break, yet
key distribution is a major concern in such methods. Although Rivest-Shamir-Adleman (RSA) is one of
the best-known public-key systems, it is not efficient for encrypting huge amounts of medical data, but it is
advantageous for key management [5]. In general, asymmetric methods are slower and
less secure, yet they have proved advantageous for key distribution [5], [11].

The main objective of this paper is to propose an encryption algorithm that allows COVID-19 patients'
records, or any medical records, to be stored and transmitted securely while preserving the privacy of the
patient's data. The proposed encryption algorithm uses a random deoxyribonucleic acid (DNA)
sequence and performs a set of fast encryption operations in several rounds using a different key
generated for each round. The key for each round is selected randomly based on the DNA sequence
and a mapping table to ensure a high level of confusion and diffusion.

The rest of the paper is organized as follows: section 2 examines the most recent related work. The
proposed encryption algorithm is presented in section 3. Experimental results and an evaluation of the proposed
model are presented in section 4. Finally, conclusions are drawn in section 5.


2. RELATED WORK 

Several methods have been introduced in the literature to create strong and fast
encryption methods for medical data such as COVID-19 patients' data, which can be textual, image, or
relational data. In this section, we present and discuss some of the most closely related work. A hybrid encryption
model was proposed in [10], where the authors introduced the P-AES encryption algorithm for medical data records
stored in the Cloud. The authors noted that most healthcare institutes hold their patients' data in the
Cloud, which allows real-time sharing of the data among
physicians and researchers. To overcome the security problem of storing medical records in the Cloud,
they combined an improved AES with RSA to
ensure the secrecy of the data: the improved AES encrypts and decrypts the data, while RSA handles key
distribution. Unfortunately, the P-AES algorithm can only be applied to textual
data and is not suitable for images such as the X-ray and CT images of COVID-19 patients.

Attribute-based encryption algorithms for the secure sharing of medical data were proposed in [12]. The algorithm
uses public-key encryption for physicians to encrypt patient data and the messages exchanged
between the Cloud and physicians. Many attributes are used in the encryption algorithm to determine
who can access the medical data and the Cloud servers where the data resides, ensuring security while preserving the
privacy of patient medical data, as in [13].

To protect the confidentiality of patients' medical data, an approach based on steganography
was proposed in [14] to ensure that such data is not processed or altered by unauthorized users. The approach
uses double-based pixel allocation and a three Bit Invert System before encrypting the medical data to
improve the security level. The authors also applied an affine cipher to encrypt the medical data and used Huffman
coding to reduce the size of the encrypted data before the embedding process, which increases the payload capacity.

The use of homomorphic encryption to preserve the privacy of genomic data was investigated in [1].
The authors used cryptographic keys to encrypt genomic variant data within the i2b2 framework so that
researchers can use such data. They claimed that homomorphic encryption not only
ensures confidentiality but also preserves privacy when genomic data is queried in i2b2; however, the i2b2
framework in that work is accessed through a local network.

Gabetta et al. [15] implemented a new encryption algorithm based on
four block ciphers. They used binary tree traversal for multi-bit word substitution and a 2D
array for the diffusion process. This approach requires more memory and a longer
encryption time than AES. A similar approach was introduced in [16] to improve the
security of Amazigh text; it combines elliptic curve cryptography (ECC) with
binary tree traversal. However, the use of ECC still suffers from the problems associated with binary curves.

An auto-generated key was introduced to encrypt images in [17]. The method is based on a block
cipher that uses any digital file as a seed for secret key generation. A block size of 32 was used with
a variable key size, but the key remains breakable because of its small size. The digital file was used to
create a substitution table to encrypt the image. Moreover, part of the encryption key was embedded within
the encrypted image. In contrast, our proposed encryption algorithm applies multi-round encryption to the
image, where the key for each round is generated based on the DNA sequence and a mapping table for the substitution and
transposition processes.

A multi-level DNA-based encryption algorithm was introduced in [18], where a DNA
sequence, or tape, is used to generate the key based on the sensitivity of the data to be encrypted. The random DNA selection
process is based on the Blum Blum Shub (BBS) generator. The BBS-DNA tape is used to produce a new DNA tape
for the substitution and transposition processes, and the DNA tape used in the encryption
process is embedded in the encrypted data using the Hadamard matrix.

Another DNA-based encryption algorithm for text data was proposed in [19], where the key is randomly
selected from a DNA mapping table. The text data is converted to its American standard code for information
interchange (ASCII) representation, and the ASCII representation is then converted into binary
code. The DNA table is used to encode the binary representation of the text data, which is then encrypted using
simple substitution rules. However, the key used in this method is only 32 bits long, and because of this small size it is
easy to break.

A DNA-based method for encrypting images on memory-limited devices was
introduced in [20]. The method uses the least significant bit (LSB) to embed patient information in images
such as CT images. The image is then encrypted using a set of rules derived from a DNA tape along
with a multi chaotic map. Other methods combine the Cloud and DNA sequences to create strong
encryption methods for protecting data transmission [21]-[23]. Most of the abovementioned
related work uses DNA only to generate the key, whereas the algorithm proposed in this paper uses both a
DNA tape and a mapping table to ensure a high level of confusion and diffusion. In
addition, most of the related work uses a single round, whereas multi-round encryption with a different
encryption key for each round enhances security against possible attacks.


3. THE PROPOSED ENCRYPTION ALGORITHM

Multi-round encryption has been used and recommended by most well-known cryptosystems such as DES, 3DES, AES,
and others [5], [24], [25]. To achieve a high level of protection for confidential data, it is necessary to take
into account the following factors:

— The key should be as large as possible

— The key should be as random as possible

— Key transmission between users should be easy

— Encryption operations must be performed on the data for several rounds

— A different key must be used in each round

— Satisfying the above factors should still keep the encryption time at an acceptable level.

Based on these factors, the proposed encryption algorithm uses a random DNA
sequence to perform a set of fast encryption operations in a number of rounds, with a different key
generated for each round. Figure 1 shows the general model of the proposed encryption algorithm. To make
the operations performed in each step of the algorithm easier to follow, we first define the following
data structures:

— Source data (SD): a digital COVID-19 data file entered by a user to be encrypted. Initially, the source data
is divided into 2-bit segments. Each 2-bit segment represents one of four decimal numbers, as shown in Table 1.

— Source data length (SDLength): refers to the length of the source data after converting the source data to 
the corresponding sequence of numbers (0, 1, 2, and 3). The SDLength is calculated using (1). 


$$\text{SDLength} = \text{Source Data Length (in bytes)} \times 4 \qquad (1)$$

— DNA sequence (DNA): refers to the DNA sequence used by the algorithm. The DNA sequence consists
of a random sequence of four letters (A: Adenine, T: Thymine, C: Cytosine, and G: Guanine). The DNA
letters are mapped to four decimal numbers (0, 1, 2, and 3) as shown in Figure 2(a), so the
algorithm treats the DNA sequence as a random sequence of these four numbers, as depicted in Figure 2(b).

— DNA length (DNALength): refers to the length (in letters) of the DNA sequence.

— Number of rounds (NRound): represents the number of rounds in which the encryption operations are
performed in the algorithm.

— Substitution index (SIndex): refers to the index of the part of the DNA sequence that is used during the 
substitution operation in the algorithm. 

— Transposition index (TIndex): refers to the index of the part of the DNA sequence that is used during the 
transposition operation in the algorithm. 


[Figure 1 labels: Source Data; Pseudo-Random Generation of (1) Substitution Index, (2) Transposition Index, (3) Mapping Table Index; Next Round]

Figure 1. General model of the proposed image encryption algorithm


Table 1. 2-bit binary numbers and the corresponding decimal representation
Binary    Decimal
00        0
01        1
10        2
11        3
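
As an illustration of the 2-bit segmentation in Table 1 and the length rule in (1), the following minimal Python sketch converts a byte string into the sequence of numbers 0 to 3 and back. The function names, and the choice to read each byte starting from its most significant bit pair, are assumptions made for illustration; the bit order is not specified in the text.

```python
def bytes_to_2bit_segments(data: bytes) -> list[int]:
    """Split each byte into four 2-bit values (0..3), most significant pair first."""
    segments = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # assumed bit order: high pair to low pair
            segments.append((byte >> shift) & 0b11)
    return segments


def segments_to_bytes(segments: list[int]) -> bytes:
    """Inverse operation: pack every four 2-bit values back into one byte."""
    out = bytearray()
    for i in range(0, len(segments), 4):
        value = 0
        for seg in segments[i:i + 4]:
            value = (value << 2) | seg
        out.append(value)
    return bytes(out)


sd = bytes_to_2bit_segments(b"\x1b")  # 0x1b is 00 01 10 11 in binary
print(sd)       # [0, 1, 2, 3]
print(len(sd))  # 4, i.e. SDLength = 1 byte x 4
```

Packing the segments back with `segments_to_bytes` recovers the original bytes, which is what a decryption routine would need after undoing the encryption operations.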


Figure 2. The DNA letters are treated as a range of four decimal numbers: (a) mapping table of DNA
letters and the corresponding numbers, (b) DNA sequence and its numerical representation


— Mapping table (MT): refers to a table that has two columns that link each of the DNA letters/numbers 
(0, 1, 2 and 3) to a non-repeated letter/number extracted from the DNA sequence. This table will be used 
during the transposition operation in the algorithm. Table 2 shows an example of the mapping table (MT). 

— Mapping table index (MIndex): refers to the index of the part of the DNA sequence that is used to build 
the mapping table (MT) used during the transposition operation in the algorithm. 

— Encrypted data (ED): a digital COVID-19 data file that is produced after completing the encryption
operations on the source data. It is treated by the algorithm as a collection of bytes.


Table 2. An example of the mapping table (MT)
DNA Letter    Extracted Letters from DNA Sequence
[table rows not recoverable from the source]


As shown in Figure 1, the proposed encryption algorithm includes three main phases for
encrypting the secret COVID-19 data. The user should transmit the key that is entered into the algorithm
over a secret channel. This key consists of only three numbers (SIndex, TIndex and MIndex). These numbers
represent three indices in a previously agreed DNA file that exists in one of the public DNA datasets.


3.1. Pseudo-random generation phase 

To ensure that the values of SIndex, TIndex and MIndex differ in each round of the
algorithm, starting from the initial values entered by the user, a pseudo-random generation
algorithm is adopted. It is based on a Seed value calculated from the current
SIndex, TIndex and MIndex and produces new random values. The Seed value is calculated using (2).


$$\text{Seed} = \text{SIndex} + \text{TIndex} + \text{MIndex} \qquad (2)$$


The Seed value calculated from (2) is used by the adopted random generation algorithm to generate new
values for SIndex, TIndex and MIndex.
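
The per-round index update can be sketched as follows. The text does not name the pseudo-random generator, so this sketch assumes Python's standard `random.Random` seeded with the value from (2), and assumes that each new index is drawn from the range 0 to DNALength-1; the function name `next_indices` is hypothetical. A production design would likely need a cryptographically secure generator instead.

```python
import random


def next_indices(s_index: int, t_index: int, m_index: int, dna_length: int):
    """Derive new (SIndex, TIndex, MIndex) values for the next round."""
    seed = s_index + t_index + m_index  # Seed as defined in equation (2)
    rng = random.Random(seed)           # assumed generator, not specified in the text
    # Assumption: every index must point inside the DNA sequence.
    return tuple(rng.randrange(dna_length) for _ in range(3))


indices = (10, 20, 30)
for round_no in range(3):
    indices = next_indices(*indices, dna_length=1000)
    print(round_no, indices)
```

Because the generator is seeded deterministically, the receiver can regenerate the same index sequence from the three transmitted numbers.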


3.2. Preparation phase 
Three preparation operations are performed to set the necessary parameters for the encryption phase:

— Set the DNA sequence to be used in the substitution operation. This sequence runs from index SIndex
to index (SIndex+SDLength)-1 and wraps around to index 0 whenever the end of
the DNA sequence is reached.

— Set the DNA sequence to be used in the transposition operation. This sequence runs from index TIndex
to index (TIndex+SDLength)-1 and wraps around to index 0 whenever the end of
the DNA sequence is reached.

— Build the mapping table (MT) by extracting non-repeated letters/numbers from the DNA sequence
starting from index MIndex. The extracted letters/numbers are filled sequentially (from top to bottom)
into the second column of MT, so that a random letter/number is assigned to each DNA letter/number. The
extraction wraps around to index 0 whenever the end of the DNA sequence is reached.
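
The three preparation operations can be sketched as below. The letter-to-number mapping `DNA_TO_NUM` is a hypothetical stand-in for the mapping in Figure 2(a), and the helper names are assumptions; the sketch also assumes the DNA sequence contains all four letters so that the mapping table can be completed.

```python
# Hypothetical letter-to-number mapping; the actual one is given in Figure 2(a).
DNA_TO_NUM = {"A": 0, "T": 1, "C": 2, "G": 3}


def dna_to_numbers(dna: str) -> list[int]:
    return [DNA_TO_NUM[ch] for ch in dna]


def circular_slice(dna_nums: list[int], start: int, length: int) -> list[int]:
    """Take `length` values from `start`, wrapping to index 0 at the end."""
    n = len(dna_nums)
    return [dna_nums[(start + i) % n] for i in range(length)]


def build_mapping_table(dna_nums: list[int], m_index: int) -> dict[int, int]:
    """Map 0..3 to the first four distinct values found from m_index onward."""
    n = len(dna_nums)
    extracted = []
    for i in range(n):
        value = dna_nums[(m_index + i) % n]
        if value not in extracted:
            extracted.append(value)
        if len(extracted) == 4:
            break
    if len(extracted) < 4:
        raise ValueError("DNA sequence must contain all four letters")
    return dict(zip(range(4), extracted))


dna = dna_to_numbers("ACGTTGCA")    # [0, 2, 3, 1, 1, 3, 2, 0]
print(circular_slice(dna, 6, 4))    # [2, 0, 0, 2]
print(build_mapping_table(dna, 5))  # {0: 3, 1: 2, 2: 0, 3: 1}
```

The same `circular_slice` helper serves both the substitution sequence (starting at SIndex) and the transposition sequence (starting at TIndex).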


3.3. Encryption phase 
Substitution and transposition are the two main operations performed in this phase. These operations
help to achieve the necessary level of confusion and diffusion in the source data to produce the encrypted data.

The two operations are described as follows:

— Substitution operation: This operation performs a logical XOR between the
2 bits of each number in the SD and the 2 bits of the corresponding number in the DNA sequence, starting
from index SIndex. Table 3 shows the truth table of the XOR logical operation. This operation produces the
confusion effect in the SD because it changes the SD bits. Every number in the SD
is changed many times, since it is XORed with a different number in each round of the
algorithm.


Table 3. Truth table of XOR logical operation
BitX    BitY    BitZ
0       0       0
0       1       1
1       0       1
1       1       0
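
A minimal sketch of the substitution operation follows, assuming the source data and the DNA sequence are already lists of numbers in the range 0 to 3 and that the DNA index wraps around as described in the preparation phase. The function name `substitute` is hypothetical.

```python
def substitute(sd: list[int], dna_nums: list[int], s_index: int) -> list[int]:
    """XOR each 2-bit value of SD with the DNA value at the matching offset."""
    n = len(dna_nums)
    return [value ^ dna_nums[(s_index + k) % n] for k, value in enumerate(sd)]


sd = [0, 1, 2, 3]
dna = [0, 2, 3, 1, 1, 3, 2, 0]
encrypted = substitute(sd, dna, 6)      # DNA values used: 2, 0, 0, 2
print(encrypted)                        # [2, 1, 2, 1]
print(substitute(encrypted, dna, 6))    # [0, 1, 2, 3]
```

Since XOR is its own inverse, applying the same operation with the same SIndex restores the original values, which is how this step would be undone during decryption.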


— Transposition operation: The following pseudo-code explains how to implement this operation.
a. First1=TIndex;
b. Last1=(TIndex+SDLength)-1;
c. First2=0;
d. Last2=SDLength-1;
e. While (First2<Last2)
f. Based on the MT table,
g. If (DNA[First1] => DNA[Last1]) then
h. SWAP(SD[First2], SD[Last2])
i. First1=First1+1;
j. Last1=Last1-1;
k. First2=First2+1;
l. Last2=Last2-1;
m. EndWhile
This operation produces the diffusion effect in the SD because it changes the positions
of the SD numbers. Every number in the SD may be moved to a different position in each round of the
algorithm. The above three phases are repeated NRound times, which
helps to achieve stronger confusion and diffusion effects in the SD. Using different random indices (SIndex,
TIndex and MIndex) as the key in each round makes it more difficult for attackers to guess all
of these numbers.
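
The transposition pseudo-code can be sketched in Python as below. Two points are interpretations rather than statements from the text: the swap condition is read as "the mapping table MT maps DNA[First1] to DNA[Last1]", and the loop is assumed to run until the two SD pointers meet. DNA indices are assumed to wrap around as in the preparation phase. The function name `transpose` is hypothetical.

```python
def transpose(sd: list[int], dna_nums: list[int], mt: dict[int, int],
              t_index: int) -> list[int]:
    """Swap mirrored SD positions whenever MT maps DNA[First1] to DNA[Last1]."""
    n = len(dna_nums)
    out = list(sd)
    first1, last1 = t_index, t_index + len(sd) - 1  # pointers into the DNA sequence
    first2, last2 = 0, len(sd) - 1                  # pointers into the source data
    while first2 < last2:                           # assumed loop condition
        # Assumed reading of the swap test in the pseudo-code.
        if mt[dna_nums[first1 % n]] == dna_nums[last1 % n]:
            out[first2], out[last2] = out[last2], out[first2]
        first1 += 1
        last1 -= 1
        first2 += 1
        last2 -= 1
    return out


sd = [0, 1, 2, 3, 3, 2]
dna = [0, 2, 3, 1, 1, 3, 2, 0]
mt = {0: 3, 1: 2, 2: 0, 3: 1}
shuffled = transpose(sd, dna, mt, 0)
print(shuffled)                        # [2, 1, 3, 2, 3, 0]
print(transpose(shuffled, dna, mt, 0)) # [0, 1, 2, 3, 3, 2]
```

Under this reading, the swap decisions depend only on the DNA sequence, MT, and TIndex, not on the data, so repeating the operation with the same parameters restores the original order.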




4. RESULTS AND DISCUSSION 

Several COVID-19 datasets [26]-[29] were used in the experiments to test the performance of the
proposed encryption algorithm. These datasets contain a large number of files of various types such as Excel
sheets, images, and database tables. Figure 3 shows examples of these files. Key size, encryption time,
security level, information entropy, and the avalanche effect are used to compare the quality of the
proposed encryption algorithm with that of well-known techniques such as DES and AES.


[Figure 3 panels: a laboratory result form for a SARS-CoV-2 test, and a summary of confirmed COVID-19 cases in Central Visayas (report as of 29 March 2020)]


Figure 3. Examples of COVID-19 data files


4.1. Key size 

The user's key that is entered into the proposed encryption algorithm consists of four integers
(SIndex, TIndex, MIndex, and NRound), as mentioned before. Each integer is represented by 16 bits, so
the four integers total 16 bits x 4 = 64 bits. However, because the algorithm runs for a
number of rounds, it uses different SIndex, TIndex and MIndex values in each round.

In addition, thousands of DNA files are available in global databases, any of which can be
selected in advance by the user for use in the algorithm. Therefore, the total number of bits in the key can
be estimated using (3). If we assume that the algorithm runs for only five rounds and exclude the bits
needed to represent the number of DNA files available in the global databases, the key size calculated
from (3) equals (48 bits x 5) + 16 bits = 256 bits. This is sufficient to make it difficult for
attackers to break the key.


$$\text{KeySize}_{bit} = (48 \times \text{NRound}) + 16 + \frac{\log(\text{Number of Available DNA})}{\log(2)} \qquad (3)$$
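
As a worked example of (3), the short sketch below evaluates the key size for a given number of rounds. The function name `key_size_bits` and the figure of 1024 available DNA files are hypothetical values chosen only to illustrate the logarithmic term.

```python
import math


def key_size_bits(n_round: int, num_dna_files: int) -> float:
    """Key size from equation (3): 48 bits per round, 16 bits for NRound,
    plus the bits needed to select one of the available DNA files."""
    return 48 * n_round + 16 + math.log(num_dna_files) / math.log(2)


print(key_size_bits(5, 1))               # 256.0 (DNA-file term excluded)
print(round(key_size_bits(5, 1024), 6))  # 266.0 (hypothetical 1024 DNA files)
```

Each additional round adds 48 bits, while the choice of DNA file adds only a logarithmic number of bits.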


Comparing the size of the key used by the proposed algorithm with those of a set of well-known
encryption techniques shows that the proposed encryption algorithm can use a very large key,
which grows with the number of rounds. Table 4 presents the size of the key used by the proposed encryption algorithm and
other well-known encryption techniques.


Table 4. Size of key used in the proposed algorithm and well-known encryption techniques [9], [19]
Encryption System     Size of the Key (bit)
Proposed algorithm    Much more than 64
Blowfish              32
AES                   256
DES                   56
3DES                  168
Twofish               128


4.2. Encryption and decryption time 

One of the main challenges in developing a secure multi-round encryption method is keeping
the encryption time at an acceptable level. The proposed encryption algorithm was designed around
fast operations: the XOR logical operation is used instead of more expensive arithmetic
operations, and the small mapping table MT also helps to reduce the
time needed to run the algorithm.

A large number of files of various types taken from the COVID-19 datasets (for example, Excel
sheets, images, and database tables) were encrypted during the experiments. These files were encrypted
using the proposed encryption algorithm and the two well-known encryption techniques DES and AES.
Tables 5 and 6 show the encryption and decryption times recorded in the experiments (using
different types of files of different sizes) to achieve the same rate of confusion and diffusion in the encrypted
data with the proposed algorithm, DES, and AES. The encryption and decryption
times recorded by the proposed encryption algorithm are clearly competitive, which encourages the use of the
proposed encryption algorithm in the area of information protection.


Table 5. The encryption time of the proposed algorithm and the two methods DES and AES
                          Encryption time (sec)
Data      Data Size (KB)  Proposed  DES    AES
Data 1    2.25            1.982     2.89   2.45
Data 2    1.199           1.231     2.41   1.917
Data 3    196             0.238     0.500  0.352
Data 4    201             0.368     0.327  0.501


Table 6. The decryption time of the proposed algorithm and the two methods DES and AES
                          Decryption time (sec)
Data      Data Size (KB)  Proposed  DES    AES
Data 1    2.25            1.871     2.80   2.30
Data 2    1.199           1.199     2.35   1.878
Data 3    196             0.201     0.458  0.332
Data 4    201             0.342     0.300  0.480


4.3. Security level 

Two types of tests were used in the experiments to assess the level of protection achieved by the
proposed encryption algorithm. A good encryption method is one that achieves
the highest possible ratio of confusion and diffusion effects in the encrypted data. These effects can be
measured numerically by calculating the normalized mean absolute error (NMAE) and the peak signal to noise
ratio (PSNR) using (4) and (5), respectively.
Table 7 shows the PSNR recorded in the experiments (using different types of files of different sizes) for
the proposed algorithm and the two methods DES and AES.


$$\text{NMAE} = \frac{\sum_{k=0}^{SDLength-1} \left| SD[k] - ED[k] \right|}{\sum_{k=0}^{SDLength-1} SD[k]} \times 100 \qquad (4)$$

$$\text{PSNR} = 10 \cdot \log_{10}\left(\frac{MaxSD^{2}}{MSE}\right) \ \text{(dB)} \qquad (5)$$


where MaxSD is the maximum possible byte value in the source data SD, MSE is the mean squared error between SD and ED, and dB refers to decibel.

The PSNR recorded by the proposed encryption algorithm during the experiments is very
close to that of the known encryption methods DES and AES. To illustrate how the confusion and diffusion
effects develop in the encrypted data from round to round, one image file was chosen and encrypted with the
proposed algorithm for a number of rounds. The encrypted images produced in these rounds are
depicted in Figure 4 (a)-(c).
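
The two measures referred to in (4) and (5) can be computed as in the following sketch. It assumes the conventional definitions: NMAE as the summed absolute difference divided by the summed source values, and PSNR computed from the mean squared error with MaxSD = 255 for byte data. The function names are hypothetical, and the source and encrypted data are assumed to have equal length.

```python
import math


def nmae(sd: bytes, ed: bytes) -> float:
    """Normalized mean absolute error between source and encrypted bytes, in percent."""
    total_diff = sum(abs(s - e) for s, e in zip(sd, ed))
    return 100.0 * total_diff / sum(sd)


def psnr(sd: bytes, ed: bytes, max_sd: int = 255) -> float:
    """Peak signal to noise ratio in dB; lower values mean ED differs more from SD."""
    mse = sum((s - e) ** 2 for s, e in zip(sd, ed)) / len(sd)
    if mse == 0:
        return math.inf  # identical data: no noise at all
    return 10.0 * math.log10(max_sd ** 2 / mse)


source = bytes([10, 20, 30, 40])
cipher = bytes([20, 10, 40, 30])
print(nmae(source, cipher))            # 40.0
print(round(psnr(source, cipher), 2))  # 28.13
```

For an encryption method, a low PSNR is desirable because it indicates that the encrypted data bears little resemblance to the source data.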

The confusion and diffusion effects can also be tested statistically by comparing the histogram of
byte values in the source data with that of the encrypted data. A good encryption method is one
that achieves the flattest possible histogram for the encrypted data. Figure 5 (a)-(d) shows examples of
the histograms of source and encrypted data generated using the proposed algorithm and the two methods
DES and AES. The flat histogram of byte values achieved by the proposed
algorithm (similar to those of DES and AES) indicates good confusion and diffusion effects
in the encrypted data.
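
A byte-value histogram of the kind plotted in Figure 5 can be computed as sketched below. The function name and the use of the gap between the largest and smallest bin as a rough flatness indicator are assumptions for illustration; the text itself judges flatness visually.

```python
def byte_histogram(data: bytes) -> list[int]:
    """Count how often each of the 256 possible byte values occurs."""
    hist = [0] * 256
    for value in data:
        hist[value] += 1
    return hist


# A perfectly flat histogram: every byte value appears exactly four times.
flat = bytes(range(256)) * 4
hist = byte_histogram(flat)
print(max(hist) - min(hist))   # 0

# Highly structured data concentrates its counts in a few bins.
skewed = byte_histogram(b"AAAAAAAB")
print(max(skewed) - min(skewed))  # 7
```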


Table 7. The PSNR values of the proposed algorithm and the two methods DES and AES
Data      Data Size (KB)  Proposed  DES    AES
Data 1    2.25            5.501     5.462  5.401
Data 2    1.199           5.625     5.412  5.403
Data 3    196             5.423     5.417  5.404
Data 4    201             6.533     6.546  6.543


Figure 4. The implementation of the proposed algorithm in a number of rounds: (a) one round,
(b) three rounds, (c) ten rounds

Figure 5. Histogram of (a) source data, (b) encrypted data using the proposed algorithm, (c) encrypted data
using DES method, and (d) encrypted data using AES method


4.4. Information entropy

Information entropy is an essential measure of the randomness of the data, such as an input image. Entropy is the
average (expected) amount of information in the data [30], [31]. For a digital image, it is hard to predict
the content if its information entropy is high. Here, the entropy is calculated by (6).


$$\text{Entropy} = -\sum_{i=1}^{n} P_i \cdot \log_2 (P_i) \qquad (6)$$


where n is the number of different data values and P_i is the probability of occurrence of the i-th data value.
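
Equation (6) can be evaluated over byte data with a short sketch such as the following, where the probabilities are estimated from the observed byte frequencies. The function name `entropy` is hypothetical.

```python
import math
from collections import Counter


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, following equation (6)."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(entropy(bytes(range(256))))  # 8.0, every byte value equally likely
print(entropy(b"aabb"))            # 1.0, two equally likely values
```

For byte data the maximum possible value is 8, reached only when all 256 byte values occur with equal frequency.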

The information entropy values of the source and encrypted data are listed in Table 8. The values
indicate that the proposed encryption algorithm achieved a competitive level of randomness in the
encrypted data compared with the known encryption methods DES and AES. Furthermore, the
information entropy of the data encrypted by the proposed algorithm is close to 8, the maximum for byte data,
which suggests that a successful statistical attack would be difficult to carry out.


Table 8. Entropy values of the source and encrypted data
                          Entropy
Data      Data Size (KB)  Source    Proposed  DES       AES
Data 1    2.25            4.171996  7.991457  7.999849  7.999841
Data 2    1.199           4.177464  7.998691  7.999015  7.998972
Data 3    196             4.182701  7.999013  7.999332  7.999311
Data 4    201             4.266782  7.998982  7.998990  7.998995


4.5. Avalanche effect 

The avalanche effect test is a numeric metric used to check the sensitivity of the encryption method
to small changes in its parameters [32]. In a high-quality encryption method, a slight change in the
input (either in the key or in the source data) should
cause significant changes in the encrypted data. Equation (7) is used to calculate the percentage of bits that
change in the encrypted data when a few bits are changed in the encryption key.

Table 9 shows the calculated values of the avalanche effect using (7) for "Data 2". Table 10 shows 
the average value of the avalanche effect during the experiments on different data files using the proposed 
method compared with those values of the avalanche effect test for DES and AES encryption methods. The 
results in Tables 9 and 10 confirm that the proposed method achieved an acceptable average value of the 
avalanche effect compared to those of DES and AES algorithms. 


$$\text{Aval. Effect} = \frac{\text{Number of changed bits in encrypted data}}{\text{Total number of bits in encrypted data}} \times 100 \qquad (7)$$
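
The avalanche measurement can be sketched as follows, assuming it is computed as the percentage of bits that differ between two encryptions of the same source data produced with slightly different keys. The function name is hypothetical, and the two encrypted outputs are assumed to have equal length.

```python
def avalanche_effect(ed1: bytes, ed2: bytes) -> float:
    """Percentage of bits that differ between two encrypted outputs."""
    if len(ed1) != len(ed2):
        raise ValueError("encrypted outputs must have the same length")
    changed = sum(bin(a ^ b).count("1") for a, b in zip(ed1, ed2))
    return 100.0 * changed / (8 * len(ed1))


# 4 of the 16 bits differ between these two outputs.
print(avalanche_effect(b"\x00\xff", b"\x0f\xff"))  # 25.0
```

A value near 50% means that, on average, each output bit flips with probability one half when the key changes slightly.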


Table 9. The recorded values of the avalanche effect for "Data 2"
Number of bits changed in the encryption key    Avalanche test value (%)
1                                               50.133
3                                               50.222
10                                              50.407


Table 10. The average of avalanche effect values for the proposed algorithm and
the two methods DES and AES
Encryption Method    Average of the avalanche effect values (%)
Proposed             50.254
DES                  50.236
AES                  50.344


5. CONCLUSION 

In this paper, we introduced a multi-round, DNA-based encryption algorithm for COVID-19
medical data. In each round, the key is
selected randomly based on the DNA sequence and a mapping table to ensure a high level of confusion and diffusion.
In the experiments, the proposed algorithm showed favorable performance compared with DES and AES in terms of
key size and encryption time. The low PSNR values, together with the flat, uniform histogram of the
encrypted data, indicate strong confusion and diffusion effects. Security analysis using
information entropy and the avalanche effect shows that the proposed algorithm achieves a high level of security and
is expected to withstand common attacks, comparably to the other algorithms tested.